Character Spotting of Historical Documents Using Pattern Segmentation Aided by Recognition Processing
                    
                        
Similar resources
Integrating Optical Character Recognition and Machine Translation of Historical Documents
Machine Translation (MT) plays a critical role in expanding capacity in the translation industry. However, many valuable documents, including digital documents, are encoded in non-accessible formats for machine processing (e.g., Historical or Legal documents). Such documents must be passed through a process of Optical Character Recognition (OCR) to render the text suitable for MT. No matter how...
Semi-supervised learning for character recognition in historical archive documents
Training recognizers for handwritten characters is still a very time-consuming task involving tremendous amounts of manual annotation by experts. In this paper, we present semi-supervised labeling strategies that considerably reduce the human effort. We propose two different methods to label and later recognize characters in collections of historical archive documents. The first one...
Compression of binary documents using pattern recognition
To preserve all the information in analog documents when they are digitized, high resolution scanning of them has become common. When the analog document has a large physical size, which is frequently the case, the resulting digital image occupies a large storage space. Hence, methods that compress document images substantially without loss of significant information are of great practical imp...
Character Recognition Without Segmentation
A segmentation-free approach to OCR is presented as part of a knowledge-based word interpretation model. This new method is based on the recognition of subgraphs homeomorphic to previously defined prototypes of characters [16]. Gaps are identified as potential parts of characters by implementing a variant of the notion of relative neighborhood used in computational perception. In the system, ea...
An Arabic optical character recognition system using recognition-based segmentation
Optical character recognition (OCR) systems improve human-machine interaction and are widely used in many areas. The recognition of cursive scripts is a difficult task as their segmentation suffers from serious problems. This paper proposes an Arabic OCR system, which uses a recognition-based segmentation technique to overcome the classical segmentation problems. A newly developed Arabic word segm...
Journal
Journal title: IEEJ Transactions on Electronics, Information and Systems
Year: 2002
ISSN: 0385-4221,1348-8155
DOI: 10.1541/ieejeiss1987.122.11_1876